Document Image Dewarping Contest
Authors
Abstract
Dewarping of documents captured with hand-held cameras in an uncontrolled environment has triggered a lot of interest in the scientific community over the last few years and many approaches have been proposed. However, there has been no comparative evaluation of different dewarping techniques so far. In an attempt to fill this gap, we have organized a page dewarping contest along with CBDAR 2007. We have created a dataset of 102 documents captured with a hand-held camera and have made it freely available online. We have prepared text-line, text-zone, and ASCII text ground-truth for the documents in this dataset. Three groups participated in the contest with their methods. In this paper we present an overview of the approaches that the participants used, the evaluation measure, and the dataset used in the contest. We report the performance of all participating methods. The evaluation shows that none of the participating methods was statistically significantly better than any other participating method.
Similar resources
Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)
Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...
An Image Based Performance Evaluation Method for Page Dewarping Algorithms Using SIFT Features
Dewarping of camera-captured document images is one of the important preprocessing steps before feeding them to a document analysis system. Over the last few years, many approaches have been proposed for document image dewarping. Usually optical character recognition (OCR) based and/or feature based approaches are used for the evaluation of dewarping algorithms. OCR based evaluation is a good meas...
Ridges Based Curled Textline Region Detection from Grayscale Camera-Captured Document Images
As compared to scanners, cameras offer fast, flexible and non-contact document imaging, but with distortions like uneven shading and warped shape. Therefore, camera-captured document images need preprocessing steps like binarization and textline detection for dewarping so that traditional document image processing steps can be applied on them. Previous approaches of binarization and curled text...
Dewarping of Document Images using Coupled-Snakes
Traditional OCR systems are designed for planar (dewarped) images, and their accuracy is reduced when applied to warped images. Therefore, developing new OCR techniques for warped images or developing dewarping techniques are the possible solutions for improving OCR accuracy on camera-captured documents. Among different types of dewarping techniques, curled textlines information based dewarping techn...
Border Noise Removal of Camera-Captured Document Images Using Page Frame Detection
Camera-captured document images usually contain two main types of marginal noise: textual noise (coming from neighboring pages) and non-textual noise (resulting from the page surrounding and/or binarization process). These types of marginal noise degrade the performance of the preprocessing (dewarping) of camera-captured document images and subsequent document digitization/recognition processes...